Neural Network Design

Installs: 122
Rank: #7044

Install

npx skills add https://github.com/aj-geddes/useful-ai-prompts --skill 'Neural Network Design'
Overview
This skill covers designing and implementing neural network architectures, including CNNs, RNNs, Transformers, and ResNets, using PyTorch and TensorFlow, with a focus on architecture selection, layer composition, and optimization techniques.
When to Use
Designing custom neural network architectures for computer vision tasks like image classification or object detection
Building sequence models for time series forecasting, natural language processing, or video analysis
Implementing transformer-based models for language understanding or generation tasks
Creating hybrid architectures that combine CNNs, RNNs, and attention mechanisms
Optimizing network depth, width, and skip connections for better training and performance
Selecting appropriate activation functions, normalization layers, and regularization techniques
Core Architecture Types
Feedforward Networks (MLPs)
Fully connected layers
Convolutional Networks (CNNs)
Image processing
Recurrent Networks (RNNs, LSTMs, GRUs)
Sequence processing
Transformers
Self-attention based architecture
Hybrid Models
Combining multiple architecture types
Network Design Principles
Depth vs Width
Trade-offs between layers and units
Skip Connections
Residual networks for deeper training
Normalization
Batch norm, layer norm for stability
Regularization
Dropout, L1/L2 to prevent overfitting
Activation Functions
ReLU, GELU, Swish for non-linearity
PyTorch and TensorFlow Implementation

```python
import torch
import torch.nn as nn
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
```
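The design principles above map directly onto framework primitives. As a minimal sketch (assuming only the imports above), here is how activation functions, normalization, dropout, and L2 weight decay appear in PyTorch:

```python
# Activation functions: ReLU, GELU, and Swish (called SiLU in PyTorch)
x = torch.linspace(-3, 3, 7)
print("ReLU:", nn.ReLU()(x))
print("GELU:", nn.GELU()(x))
print("Swish/SiLU:", nn.SiLU()(x))

# Normalization and dropout are layers interleaved with Linear/Conv layers
block = nn.Sequential(
    nn.Linear(64, 64),
    nn.BatchNorm1d(64),   # normalizes activations over the batch dimension
    nn.ReLU(),
    nn.Dropout(0.3),      # randomly zeroes 30% of activations during training
)

# L2 regularization is typically applied as weight decay in the optimizer
optimizer = torch.optim.AdamW(block.parameters(), lr=1e-3, weight_decay=1e-4)
```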

1. Feedforward Neural Network (MLP)

print ( "=== 1. Feedforward Neural Network ===" ) class MLPPyTorch ( nn . Module ) : def init ( self , input_size , hidden_sizes , output_size ) : super ( ) . init ( ) layers = [ ] prev_size = input_size for hidden_size in hidden_sizes : layers . append ( nn . Linear ( prev_size , hidden_size ) ) layers . append ( nn . BatchNorm1d ( hidden_size ) ) layers . append ( nn . ReLU ( ) ) layers . append ( nn . Dropout ( 0.3 ) ) prev_size = hidden_size layers . append ( nn . Linear ( prev_size , output_size ) ) self . model = nn . Sequential ( * layers ) def forward ( self , x ) : return self . model ( x ) mlp = MLPPyTorch ( input_size = 784 , hidden_sizes = [ 512 , 256 , 128 ] , output_size = 10 ) print ( f"MLP Parameters: { sum ( p . numel ( ) for p in mlp . parameters ( ) ) : , } " )

2. Convolutional Neural Network (CNN)

print ( "\n=== 2. Convolutional Neural Network ===" ) class CNNPyTorch ( nn . Module ) : def init ( self ) : super ( ) . init ( )

Conv blocks

self . conv1 = nn . Conv2d ( 3 , 32 , kernel_size = 3 , padding = 1 ) self . bn1 = nn . BatchNorm2d ( 32 ) self . pool1 = nn . MaxPool2d ( 2 , 2 ) self . conv2 = nn . Conv2d ( 32 , 64 , kernel_size = 3 , padding = 1 ) self . bn2 = nn . BatchNorm2d ( 64 ) self . pool2 = nn . MaxPool2d ( 2 , 2 ) self . conv3 = nn . Conv2d ( 64 , 128 , kernel_size = 3 , padding = 1 ) self . bn3 = nn . BatchNorm2d ( 128 ) self . pool3 = nn . MaxPool2d ( 2 , 2 )

Fully connected layers

self . fc1 = nn . Linear ( 128 * 4 * 4 , 256 ) self . dropout = nn . Dropout ( 0.5 ) self . fc2 = nn . Linear ( 256 , 10 ) self . relu = nn . ReLU ( ) def forward ( self , x ) : x = self . relu ( self . bn1 ( self . conv1 ( x ) ) ) x = self . pool1 ( x ) x = self . relu ( self . bn2 ( self . conv2 ( x ) ) ) x = self . pool2 ( x ) x = self . relu ( self . bn3 ( self . conv3 ( x ) ) ) x = self . pool3 ( x ) x = x . view ( x . size ( 0 ) , - 1 ) x = self . relu ( self . fc1 ( x ) ) x = self . dropout ( x ) x = self . fc2 ( x ) return x cnn = CNNPyTorch ( ) print ( f"CNN Parameters: { sum ( p . numel ( ) for p in cnn . parameters ( ) ) : , } " )
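Note that `fc1` hard-codes `128 * 4 * 4`, which only matches 32×32 inputs (32 → 16 → 8 → 4 across the three pooling stages). A minimal sketch verifying that assumption with a hypothetical CIFAR-10-sized batch:

```python
cnn.eval()
with torch.no_grad():
    out = cnn(torch.randn(8, 3, 32, 32))  # (batch, channels, height, width)
print(out.shape)  # torch.Size([8, 10])
```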

3. Recurrent Neural Network (LSTM)

print ( "\n=== 3. LSTM Network ===" ) class LSTMPyTorch ( nn . Module ) : def init ( self , input_size , hidden_size , num_layers , output_size ) : super ( ) . init ( ) self . lstm = nn . LSTM ( input_size , hidden_size , num_layers , batch_first = True , dropout = 0.3 ) self . fc = nn . Linear ( hidden_size , output_size ) def forward ( self , x ) : lstm_out , ( h_n , c_n ) = self . lstm ( x ) last_hidden = h_n [ - 1 ] output = self . fc ( last_hidden ) return output lstm = LSTMPyTorch ( input_size = 100 , hidden_size = 128 , num_layers = 2 , output_size = 10 ) print ( f"LSTM Parameters: { sum ( p . numel ( ) for p in lstm . parameters ( ) ) : , } " )

4. Transformer Block

print ( "\n=== 4. Transformer Architecture ===" ) class TransformerBlock ( nn . Module ) : def init ( self , d_model , num_heads , d_ff , dropout = 0.1 ) : super ( ) . init ( ) self . attention = nn . MultiheadAttention ( d_model , num_heads , dropout = dropout ) self . norm1 = nn . LayerNorm ( d_model ) self . norm2 = nn . LayerNorm ( d_model ) self . feedforward = nn . Sequential ( nn . Linear ( d_model , d_ff ) , nn . ReLU ( ) , nn . Dropout ( dropout ) , nn . Linear ( d_ff , d_model ) , nn . Dropout ( dropout ) ) def forward ( self , x ) :

Self-attention

attn_out , _ = self . attention ( x , x , x ) x = self . norm1 ( x + attn_out )

Feedforward

ff_out

self . feedforward ( x ) x = self . norm2 ( x + ff_out ) return x class TransformerPyTorch ( nn . Module ) : def init ( self , vocab_size , d_model , num_heads , num_layers , d_ff ) : super ( ) . init ( ) self . embedding = nn . Embedding ( vocab_size , d_model ) self . transformer_blocks = nn . ModuleList ( [ TransformerBlock ( d_model , num_heads , d_ff ) for _ in range ( num_layers ) ] ) self . fc = nn . Linear ( d_model , 10 ) def forward ( self , x ) : x = self . embedding ( x ) for block in self . transformer_blocks : x = block ( x ) x = x . mean ( dim = 1 )

Global average pooling

x

self . fc ( x ) return x transformer = TransformerPyTorch ( vocab_size = 1000 , d_model = 256 , num_heads = 8 , num_layers = 3 , d_ff = 512 ) print ( f"Transformer Parameters: { sum ( p . numel ( ) for p in transformer . parameters ( ) ) : , } " )
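The classifier consumes integer token IDs, which the embedding maps to d_model-dimensional vectors. A minimal sketch with a hypothetical batch of token sequences:

```python
transformer.eval()
with torch.no_grad():
    tokens = torch.randint(0, 1000, (4, 32))  # 4 sequences of 32 token IDs
    out = transformer(tokens)
print(out.shape)  # torch.Size([4, 10])
```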

5. Residual Network (ResNet)

print ( "\n=== 5. Residual Network ===" ) class ResidualBlock ( nn . Module ) : def init ( self , in_channels , out_channels , stride = 1 ) : super ( ) . init ( ) self . conv1 = nn . Conv2d ( in_channels , out_channels , 3 , stride = stride , padding = 1 ) self . bn1 = nn . BatchNorm2d ( out_channels ) self . conv2 = nn . Conv2d ( out_channels , out_channels , 3 , padding = 1 ) self . bn2 = nn . BatchNorm2d ( out_channels ) self . relu = nn . ReLU ( ) self . shortcut = nn . Sequential ( ) if stride != 1 or in_channels != out_channels : self . shortcut = nn . Sequential ( nn . Conv2d ( in_channels , out_channels , 1 , stride = stride ) , nn . BatchNorm2d ( out_channels ) ) def forward ( self , x ) : residual = self . shortcut ( x ) out = self . relu ( self . bn1 ( self . conv1 ( x ) ) ) out = self . bn2 ( self . conv2 ( out ) ) out += residual out = self . relu ( out ) return out class ResNetPyTorch ( nn . Module ) : def init ( self ) : super ( ) . init ( ) self . conv1 = nn . Conv2d ( 3 , 64 , 7 , stride = 2 , padding = 3 ) self . bn1 = nn . BatchNorm2d ( 64 ) self . maxpool = nn . MaxPool2d ( 3 , stride = 2 , padding = 1 ) self . layer1 = self . _make_layer ( 64 , 64 , 3 , stride = 1 ) self . layer2 = self . _make_layer ( 64 , 128 , 4 , stride = 2 ) self . layer3 = self . _make_layer ( 128 , 256 , 6 , stride = 2 ) self . layer4 = self . _make_layer ( 256 , 512 , 3 , stride = 2 ) self . avgpool = nn . AdaptiveAvgPool2d ( ( 1 , 1 ) ) self . fc = nn . Linear ( 512 , 10 ) def _make_layer ( self , in_channels , out_channels , blocks , stride ) : layers = [ ResidualBlock ( in_channels , out_channels , stride ) ] for _ in range ( 1 , blocks ) : layers . append ( ResidualBlock ( out_channels , out_channels ) ) return nn . Sequential ( * layers ) def forward ( self , x ) : x = self . maxpool ( self . bn1 ( self . conv1 ( x ) ) ) x = self . layer1 ( x ) x = self . layer2 ( x ) x = self . layer3 ( x ) x = self . layer4 ( x ) x = self . avgpool ( x ) x = x . view ( x . size ( 0 ) , - 1 ) x = self . fc ( x ) return x resnet = ResNetPyTorch ( ) print ( f"ResNet Parameters: { sum ( p . numel ( ) for p in resnet . parameters ( ) ) : , } " )

6. TensorFlow Keras Model

print ( "\n=== 6. TensorFlow Keras Model ===" ) tf_model = keras . Sequential ( [ keras . layers . Conv2D ( 32 , ( 3 , 3 ) , activation = 'relu' , input_shape = ( 32 , 32 , 3 ) ) , keras . layers . BatchNormalization ( ) , keras . layers . MaxPooling2D ( ( 2 , 2 ) ) , keras . layers . Conv2D ( 64 , ( 3 , 3 ) , activation = 'relu' ) , keras . layers . BatchNormalization ( ) , keras . layers . MaxPooling2D ( ( 2 , 2 ) ) , keras . layers . Conv2D ( 128 , ( 3 , 3 ) , activation = 'relu' ) , keras . layers . BatchNormalization ( ) , keras . layers . GlobalAveragePooling2D ( ) , keras . layers . Dense ( 256 , activation = 'relu' ) , keras . layers . Dropout ( 0.5 ) , keras . layers . Dense ( 10 , activation = 'softmax' ) ] ) print ( f"TensorFlow Model Parameters: { tf_model . count_params ( ) : , } " ) tf_model . summary ( )

7. Model comparison

```python
models_info = {
    'MLP': mlp,
    'CNN': cnn,
    'LSTM': lstm,
    'Transformer': transformer,
    'ResNet': resnet,
}
param_counts = {name: sum(p.numel() for p in model.parameters())
                for name, model in models_info.items()}

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Parameter counts
axes[0].barh(list(param_counts.keys()), list(param_counts.values()), color='steelblue')
axes[0].set_xlabel('Number of Parameters')
axes[0].set_title('Model Complexity Comparison')
axes[0].set_xscale('log')

# Architecture comparison table
architectures = {
    'MLP': 'Feedforward, Dense layers',
    'CNN': 'Conv layers, Pooling',
    'LSTM': 'Recurrent, Long-term memory',
    'Transformer': 'Self-attention, Parallel processing',
    'ResNet': 'Residual connections, Skip paths',
}
axes[1].axis('off')
table_data = [[name, architectures[name]] for name in architectures]
table = axes[1].table(cellText=table_data, colLabels=['Model', 'Architecture'],
                      cellLoc='left', loc='center', bbox=[0, 0, 1, 1])
table.auto_set_font_size(False)
table.set_fontsize(9)
table.scale(1, 2)

plt.tight_layout()
plt.savefig('neural_network_architectures.png', dpi=100, bbox_inches='tight')
print("\nVisualization saved as 'neural_network_architectures.png'")
print("\nNeural network design analysis complete!")
```
Architecture Selection Guide
MLP
Tabular data, simple classification
CNN
Image classification, object detection
LSTM/GRU
Time series, sequential data
Transformer
NLP, long-range dependencies
ResNet
Very deep networks, image tasks
Key Design Considerations
Input/output shape compatibility
Receptive field size for CNNs (computable layer by layer; see the sketch below)
Sequence length for RNNs
Attention head count for Transformers
Skip connection placement for ResNets
Deliverables
Network architecture definition
Parameter count analysis
Layer-by-layer description
Data flow diagrams
Performance benchmarks
Deployment requirements
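The receptive field noted under Key Design Considerations follows the recurrence r ← r + (k − 1)·j, where j is the cumulative stride. A minimal sketch (the `receptive_field` helper is illustrative, not part of this skill):

```python
def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, stride) layers."""
    r, j = 1, 1  # receptive field and cumulative stride (jump)
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# Conv (3x3, stride 1) + pool (2x2, stride 2), repeated three times, as in the CNN above
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2), (3, 1), (2, 2)]))  # 22
```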